Asymptotically efficient adaptive allocation rules


Similar resources

Asymptotically efficient adaptive allocation rules for the multiarmed bandit problem with switching - Automatic Control, IEEE Transactions on

We consider multiarmed bandit problems with switching cost, define uniformly good allocation rules, and restrict attention to such rules. We present a lower bound on the asymptotic performance of uniformly good allocation rules and construct an allocation scheme that achieves the bound. We discover that despite the inclusion of a switching cost the proposed allocation scheme achieves the same a...


Asymptotically Efficient Allocation Rules for the Multiarmed Bandit Problem with Multiple Plays - Part II: Markovian Rewards

At each instant of time we are required to sample a fixed number $m \ge 1$ out of $N$ Markov chains whose stationary transition probability matrices belong to a family suitably parameterized by a real number $\theta$. The objective is to maximize the long run expected value of the samples. The learning loss of a sampling scheme corresponding to a parameters configuration $C = (\theta_1, \ldots, \theta_N)$ is quantified...


Optimal Adaptive Equal Allocation Rules

Suppose we wish to decide which of two treatments is better, where the outcomes are Bernoulli random variables, the success probabilities of which, themselves, are modeled as independent beta random variables. Assume that the maximal population size for the experiment is fixed, but that the length of the study and the number and order of patients assigned to each treatment may be random. Our goal...


Linear Parameter Estimation: Asymptotically Efficient Adaptive Strategies

This paper considers the problem of distributed adaptive linear parameter estimation in multiagent inference networks. Local sensing model information is only partially available at the agents, and interagent communication is assumed to be unpredictable. The paper develops a generic mixed time-scale stochastic procedure consisting of simultaneous distributed learning and estimation, in which th...


Asymptotically Efficient Adaptive Allocation Schemes for Controlled I.I.D. Processes: Finite Parameter Space

We consider a controlled i.i.d. process whose distribution is parametrized by an unknown parameter $\theta$ belonging to some known parameter space $\Theta$, and a one-step reward associated with each pair of control and the following state of the process. The objective is to maximize the expected value of the sum of one-step rewards over an infinite horizon. By introducing the loss associated with ...



Journal

Journal title: Advances in Applied Mathematics

Year: 1985

ISSN: 0196-8858

DOI: 10.1016/0196-8858(85)90002-8